On Different Approaches to Syntactic Analysis Into Bi-Lexical Dependencies An Empirical Comparison of Direct, PCFG-Based, and HPSG-Based Parsers
نویسندگان
چکیده
We compare three different approaches to parsing into syntactic, bi-lexical dependencies for English: a ‘direct’ data-driven dependency parser, a statistical phrase structure parser, and a hybrid, ‘deep’ grammar-driven parser. The analyses from the latter two are post-converted to bilexical dependencies. Through this ‘reduction’ of all three approaches to syntactic dependency parsers, we determine empirically what performance can be obtained for a common set of dependency types for English, across a broad variety of domains. In doing so, we observe what trade-offs apply along three dimensions, accuracy, efficiency, and resilience to domain variation. Our results suggest that the hand-built grammar in one of our parsers helps in both accuracy and cross-domain performance.
منابع مشابه
Bilexical Dependencies as an Intermedium for Data-Driven and HPSG-Based Parsing
Bilexical dependencies capturing asymmetrical lexical relations between heads and dependents are viewed as a practical representation of syntax that is well-suited for computation and intelligible for human readers. In the present work we use dependency representations as a bridge between data-driven and grammar-based parsing, both for cross-framework parser comparison and for parser integratio...
متن کاملUsing Lexical and Compositional Semantics to Improve HPSG Parse Selection
Using Lexical and Compositional Semantics to Improve HPSG Parse Selection Chair of the Supervisory Committee: Dr. Emily Bender University of Washington Accurate parse ranking is essential for deep linguistic processing applications and is one of the classic problems for academic research in NLP. Despite significant advances, there remains a big need for improvement, especially for domains where...
متن کاملDeriving lexical and syntactic expectation-based measures for psycholinguistic modeling via incremental top-down parsing
A number of recent publications have made use of the incremental output of stochastic parsers to derive measures of high utility for psycholinguistic modeling, following the work of Hale (2001; 2003; 2006). In this paper, we present novel methods for calculating separate lexical and syntactic surprisal measures from a single incremental parser using a lexicalized PCFG. We also present an approx...
متن کاملFrom Surface Dependencies towards Deeper Semantic Representations
In the past, a divide could be seen between ’deep’ parsers on the one hand, which construct a semantic representation out of their input, but usually have significant coverage problems, and more robust parsers on the other hand, which are usually based on a (statistical) model derived from a treebank and have larger coverage, but leave the problem of semantic interpretation to the user. More re...
متن کاملStudying impressive parameters on the performance of Persian probabilistic context free grammar parser
In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, The Penn Treebank, was published. However, although originating in computational linguistics, the value of tree bank is becoming more widely appreciated in linguistics research as a whole. F...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- J. Language Modelling
دوره 4 شماره
صفحات -
تاریخ انتشار 2016